CS 6604: Data Mining 2 Relational Apriori 2.1 Queries

Authors

  • Naren Ramakrishnan
  • Xiaomo Liu
Abstract

1 Overview

In the previous several lectures, we mainly discussed algorithms for ILP, i.e., supervised learning of relational predicates. In this class, we focus on unsupervised algorithms for learning relational patterns. In particular, we look at algorithms that are the relational equivalents of traditional enumerative search algorithms. Take, for instance, the most famous algorithm in the association rules literature, Apriori. Its relational twin, call it Relational Apriori, is its generalization to first-order logic. Recall that one of the properties Apriori exploits is anti-monotonicity: if a set of items X does not have sufficient support, then no set Y ⊇ X can have sufficient support either.

Suppose we have a relational database consisting of three tables, "Customer", "Parent", and "Buys", as follows:

Customer
  ID
  -----
  allen
  bill
  carol
  diana

Parent
  Parent ID   Child ID
  ---------   --------
  allen       bill
  allen       carol
  bill        zoe
  carol       diana

Buys
  Customer ID   Item
  -----------   -----
  allen         wine
  bill          candy
  bill          pizza
  diana         pizza

Table 1: Relational database RD with customer information

The patterns we try to mine can be viewed as queries. We say that a query-pattern matches the database if the set of tuples returned by the query is not empty. Take an ordinary SQL query as an example: suppose we need to query the relational database for the customer who buys its child "candy": Let us convert this query into predicate logic form. The above SQL query is translated into the following logical clause form, Q1.
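The matching and anti-monotonicity ideas above can be made concrete with a minimal Python sketch over the three tables of Table 1. The function names (`parents_whose_child_buys`, `matches`, `support`) are hypothetical, and the query shown is only one possible reading of the "candy" example (customers X who have a child Y such that Y buys the item); it is not claimed to be the lecture's SQL query or its clause Q1, which are not reproduced here.

```python
# Tables from Table 1, stored as Python sets of tuples.
customer = {"allen", "bill", "carol", "diana"}
parent = {("allen", "bill"), ("allen", "carol"), ("bill", "zoe"), ("carol", "diana")}
buys = {("allen", "wine"), ("bill", "candy"), ("bill", "pizza"), ("diana", "pizza")}


def parents_whose_child_buys(item):
    """Hypothetical query (an assumed reading, not the lecture's Q1):
    customers X with some child Y such that Y buys `item`."""
    return {p for (p, c) in parent if p in customer and (c, item) in buys}


def matches(answers):
    """A query-pattern matches the database if its answer set is not empty."""
    return len(answers) > 0


# Item baskets per customer, for the classical single-table Apriori view.
baskets = {c: {i for (b, i) in buys if b == c} for c in customer}


def support(itemset):
    """Number of customers whose basket contains every item in `itemset`."""
    return sum(1 for c in customer if set(itemset) <= baskets[c])


# Customers who have at least one child: a more general query than the one above.
has_child = {p for (p, c) in parent if p in customer}

print(sorted(parents_whose_child_buys("candy")))           # ['allen']
print(matches(parents_whose_child_buys("wine")))           # False
print(support({"pizza"}), support({"candy", "pizza"}))     # 2 1
```

On this data, the hypothetical "candy" query returns only allen, so it matches the database, while the same query for "wine" returns nothing. The itemset counts illustrate anti-monotonicity: {candy, pizza} cannot be supported by more customers than {pizza} alone. The same holds at the relational level, since the answers to the more specific query are always a subset of `has_child`.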

Publication date: 2007